Kai-Wei Chang

Professor, UCLA Computer Science

Co-director, UCLA DataX AI Technology Center

My group studies how AI systems can perceive the world through multimodal models, capture and organize knowledge through language, reason over evidence, follow constraints, and collaborate with people in high-stakes settings. We aim to build models and agents that are capable, reliable, scientifically ambitious, and grounded in human values. I have long-standing experience across academia and industry, connecting foundational research with real-world AI systems.

Apr 2026

Ctrl-R was selected as a spotlight at ICML 2026.

Feb 2026

I will give a tutorial on Knowledge Control at ACL 2026.

Multimodal Foundation Models

We developed VisualBERT, one of the first vision-language models, along with follow-ups including GLIP and DesCo, which recognize objects through detailed language descriptions. Our recent work extends this line to robust video-language alignment and multimodal reasoning; for example, our 7B multimodal reasoning model outperforms GPT-4o on a visual math benchmark.

Multimodal Agents

We build multimodal agents that reason over evidence, use tools, and act across visual, web, and embodied environments. This includes compositional tool-use systems, long-term memory, and embodied AI tasks that require instruction following, perception, planning, and interaction in the physical and digital world.

Reasoning in LLMs

We develop methods that enable LLMs to adhere to specified constraints, and we study commonsense, mathematical, logical, and structured reasoning capabilities. Recent projects include controllable generation, open reasoning data recipes, and trajectory-control methods for building stronger reasoning models.

Frontier AI Evaluation

We challenge frontier AI models with difficult benchmarks in mathematics, data science, and embodied tasks that demand advanced reasoning, visual perception, and instruction following. This line of work benchmarks state-of-the-art systems and clarifies their strengths and limitations. MathVista has been used by frontier labs in their model releases, including Gemini and Grok.

Trustworthy AI

We have published pioneering work on aligning AI systems with human values and safety, focusing on fairness in generative AI, robustness, unlearning, and safety in LLMs and multimodal systems. Recent work includes selective unlearning for removing sensitive content and agentic safety reasoning. Since 2020, I have also organized Trustworthy NLP workshops at ACL* venues.